45 research outputs found

    FEMPAR: Scaling Multi-Level Domain Decomposition up to the full JUQUEEN supercomputer

    Type of production: technical report (JUQUEEN Extreme Scaling Workshop, Juelich Supercomputing Center, Germany, 2015). Format: contributed chapter. In conjunction with this year's JUQUEEN Porting and Tuning Workshop, which is part of the PRACE Advanced Training Centres curriculum, JSC continued its series of BlueGene Extreme Scaling Workshops. Seven application teams were invited to stay for two days and work on the scalability of their codes, with dedicated access to the entire JUQUEEN system for a period of 30 hours. Most of the teams' codes had thematic overlap with JSC Simulation Laboratories or were part of an ongoing collaboration with one of the SimLabs. The code teams came from the fields of climate science (ICON from DKRZ, and MPAS-A from KIT and NCAR), engineering (FEMPAR from UPC, and ex_nl/FE^2 from Uni Cologne and TU Freiberg), fluid dynamics (psOpen and SHOCK, both from RWTH Aachen), and neuroscience (CoreNeuron from the EPFL Blue Brain Project), and were supported by JSC SimLabs and cross-sectional teams, with IBM and JUQUEEN technical support. Within the first 24 hours of dedicated access to the entire 28 racks, all seven teams had adapted their codes and datasets to exploit the massive parallelism and restricted node memory for successful executions using all 458,752 cores. Most of them also demonstrated excellent strong or weak scalability, qualifying all but one for the High-Q Club. A total of 370 'large' jobs were executed using 12 of the 15 million core-hours of compute time allocated for the workshop. Detailed results for each code, provided by the application teams themselves, are introduced by an analysis comparing them to the other 16 High-Q Club codes. Preprint

    FEMPAR: an object-oriented parallel finite element framework

    FEMPAR is an open-source, object-oriented Fortran200X scientific software library for the high-performance scalable simulation of complex multiphysics problems governed by partial differential equations at large scales, exploiting state-of-the-art supercomputing resources. It is a highly modularized, flexible, and extensible library that provides a set of modules that can be combined to carry out the different steps of the simulation pipeline. FEMPAR includes a rich set of algorithms for the discretization step, namely (arbitrary-order) grad-, div-, and curl-conforming finite element methods, discontinuous Galerkin methods, B-splines, and unfitted finite element techniques on cut cells, combined with h-adaptivity. The linear solver module relies on state-of-the-art bulk-asynchronous implementations of multilevel domain decomposition solvers for the different discretization alternatives and block-preconditioning techniques for multiphysics problems. FEMPAR is a framework that provides users with out-of-the-box state-of-the-art discretization techniques and highly scalable solvers for the simulation of complex applications, hiding the dramatic complexity of the underlying algorithms. It is also a highly extensible framework for researchers who want to experiment with new algorithms and solvers. In this work, the first in a series of articles about FEMPAR, we provide a detailed introduction to the software abstractions used in the discretization module and the related geometrical module. We also provide some ingredients about the assembly of linear systems arising from finite element discretizations, but the software design of complex scalable multilevel solvers is postponed to a subsequent work. Peer Reviewed. Postprint (published version)

    Block recursive LU preconditioners for the thermally coupled incompressible inductionless MHD problem

    The thermally coupled incompressible inductionless magnetohydrodynamics (MHD) problem models the flow of an electrically charged fluid under the influence of an external electromagnetic field with thermal coupling. This system of partial differential equations is strongly coupled and highly nonlinear for real cases of interest. Therefore, fully implicit time integration schemes are very desirable in order to capture the different physical scales of the problem at hand. However, solving the multiphysics linear systems of equations resulting from such algorithms is a very challenging task which requires efficient and scalable preconditioners. In this work, a new family of recursive block LU preconditioners is designed and tested for solving the thermally coupled inductionless MHD equations. These preconditioners are obtained after splitting the fully coupled matrix into one-physics problems for every variable (velocity, pressure, current density, electric potential and temperature) that can be optimally solved, e.g., using preconditioned domain decomposition algorithms. The main idea is to arrange the original matrix into an (arbitrary) 2 × 2 block matrix, and consider a LU preconditioner obtained by approximating the corresponding Schur complement. For every one of the diagonal blocks in the LU preconditioner, if it involves more than one type of unknown, we proceed the same way in a recursive fashion. This approach is stated in an abstract way, and can be straightforwardly applied to other multiphysics problems. Further, we precisely explain a flexible and general software design for the code implementation of this type of preconditioners. Preprint
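
The 2 × 2 block LU construction described in this abstract can be illustrated with a minimal Python/NumPy sketch. This is not the authors' implementation: the function names `exact_schur` and `apply_block_lu`, the small dense test matrix, and the direct solves used for each block are all assumptions made for illustration. The sketch uses the exact Schur complement, in which case the preconditioner reproduces the full solve; a practical preconditioner would substitute an approximation.

```python
import numpy as np

def exact_schur(A11, A12, A21, A22):
    """Schur complement S = A22 - A21 A11^{-1} A12 (dense, illustration only)."""
    return A22 - A21 @ np.linalg.solve(A11, A12)

def apply_block_lu(A11, A12, A21, S_approx, r1, r2):
    """Apply P^{-1} to (r1, r2) for P = L U with
    L = [[I, 0], [A21 A11^{-1}, I]] and U = [[A11, A12], [0, S_approx]]."""
    z2 = r2 - A21 @ np.linalg.solve(A11, r1)   # forward substitution with L
    x2 = np.linalg.solve(S_approx, z2)         # (approximate) Schur complement solve
    x1 = np.linalg.solve(A11, r1 - A12 @ x2)   # back substitution with U
    return x1, x2

# Hypothetical symmetric, diagonally dominant matrix with a 3 + 2 block split.
A = np.array([
    [4.0, 1.0, 0.0, 1.0, 0.0],
    [1.0, 4.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 4.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 3.0, 1.0],
    [0.0, 1.0, 0.0, 1.0, 3.0],
])
n1 = 3
A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]
r = np.arange(1.0, 6.0)

S = exact_schur(A11, A12, A21, A22)
x1, x2 = apply_block_lu(A11, A12, A21, S, r[:n1], r[n1:])
x = np.concatenate([x1, x2])
residual = np.linalg.norm(A @ x - r)
```

In the recursive variant, a diagonal block that still couples more than one type of unknown would be handled by another call to the same routine in place of the direct solve; the sketch stops at one level.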

    The aggregated unfitted finite element method for elliptic problems

    Unfitted finite element techniques are valuable tools in different applications where the generation of body-fitted meshes is difficult. However, these techniques are prone to severe ill-conditioning problems that obstruct the efficient use of iterative Krylov methods and, in consequence, hinder the practical usage of unfitted methods for realistic large-scale applications. In this work, we present a technique that addresses such conditioning problems by constructing enhanced finite element spaces based on a cell aggregation technique. The presented method, called the aggregated unfitted finite element method, is easy to implement and can be used, in contrast to previous works, in Galerkin approximations of coercive problems with conforming Lagrangian finite element spaces. The mathematical analysis of the method shows that the condition number of the resulting linear system matrix scales as in standard finite elements for body-fitted meshes, without being affected by small cut cells, and that the method leads to the optimal finite element convergence order. These theoretical results are confirmed with 2D and 3D numerical experiments. Peer Reviewed. Postprint (author's final draft)

    Balancing domain decomposition by constraints associated with subobjects

    A simple variant of the BDDC preconditioner in which constraints are imposed on a selected set of subobjects (subdomain subedges, subfaces and vertices between pairs of subedges) is presented. We are able to show that the condition number of the preconditioner is bounded by C(1 + log(L/h))^2, where C is a constant, and h and L are the characteristic sizes of the mesh and the subobjects, respectively. As L can be chosen almost freely, the condition number can theoretically be as small as O(1). We will discuss the pros and cons of the preconditioner and its application to heterogeneous problems. Numerical results on supercomputers are provided. Peer Reviewed. Postprint (author's final draft)
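
To see how the bound quoted in this abstract, read as C times the square of (1 + log(L/h)), behaves as the subobject size L shrinks toward the mesh size h, it can be tabulated with a few lines of Python. The function name `bddc_condition_bound`, the choice C = 1, the use of the natural logarithm, and the sample ratios are assumptions for illustration; the abstract does not give a value for C.

```python
import math

def bddc_condition_bound(L_over_h, C=1.0):
    """Evaluate C * (1 + log(L/h))**2 for a ratio L/h >= 1 (natural log assumed)."""
    if L_over_h < 1.0:
        raise ValueError("this sketch assumes L/h >= 1")
    return C * (1.0 + math.log(L_over_h)) ** 2

# Hypothetical ratios of subobject size to mesh size.
ratios = [1, 2, 4, 8, 16, 32]
bounds = [bddc_condition_bound(r) for r in ratios]

for ratio, bound in zip(ratios, bounds):
    print(f"L/h = {ratio:3d}  bound = {bound:.3f}")
```

At L/h = 1 the logarithm vanishes and the bound reduces to C, which is the O(1) limit mentioned in the abstract; the bound then grows only polylogarithmically as L/h increases.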

    Simulation of high temperature superconductors and experimental validation

    In this work, we present a parallel, fully-distributed finite element numerical framework to simulate the low-frequency electromagnetic behaviour of superconducting devices, which efficiently exploits high performance computing platforms. We select the so-called H-formulation, which uses the magnetic field as a state variable. Nédélec elements (of arbitrary order) are required for an accurate approximation of the H-formulation for modelling electromagnetic fields along interfaces between regions with high contrast medium properties. An h-adaptive mesh refinement technique customized for Nédélec elements leads to a structured fine mesh in areas of interest whereas a smart coarsening is obtained in other regions. The composition of a tailored, robust, parallel nonlinear solver completes the exposition of the developed tools to tackle the problem. First, a comparison against experimental data is performed to show the suitability of the finite element approximation to model the physical phenomena. Then, a selected state-of-the-art 3D benchmark is reproduced, focusing on the parallel performance of the algorithms. Peer Reviewed. Postprint (author's final draft)

    Physics-based balancing domain decomposition by constraints for multi-material problems

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10915-018-0870-z. In this work, we present a new variant of the balancing domain decomposition by constraints preconditioner that is robust for multi-material problems. We start with a well-balanced subdomain partition, and based on an aggregation of elements according to their physical coefficients, we end up with a finer physics-based (PB) subdomain partition. Next, we define corners, edges, and faces for this PB partition, and select some of them to enforce subdomain continuity (primal faces/edges/corners). When the physical coefficient in each PB subdomain is constant and the set of selected primal faces/edges/corners satisfies a mild condition on the existence of acceptable paths, we can show both theoretically and numerically that the condition number does not depend on the contrast of the coefficient across subdomains. An extensive set of numerical experiments in 2D and 3D for the Poisson and linear elasticity problems is provided to support our findings. In particular, we show robustness and weak scalability of the new preconditioner variant up to 8232 cores when applied to 3D multi-material problems with the contrast of the physical coefficient up to 10^8 and more than half a billion degrees of freedom. For the scalability analysis, we have exploited a highly scalable advanced inter-level overlapped implementation of the preconditioner that deals very efficiently with the coarse problem computation. The proposed preconditioner is compared against a state-of-the-art implementation of an adaptive BDDC method in PETSc for thermal and mechanical multi-material problems. Peer Reviewed. Postprint (author's final draft)

    Enhanced balancing Neumann-Neumann preconditioning in computational fluid and solid mechanics

    Manuscript submitted for publication in International Journal for Numerical Methods in Engineering. Under review. Preprint

    On the scalability of inexact balancing domain decomposition by constraints with overlapped coarse/fine corrections

    In this work, we analyze the scalability of inexact two-level balancing domain decomposition by constraints (BDDC) preconditioners for Krylov subspace iterative solvers, when using a highly scalable asynchronous parallel implementation where fine and coarse correction computations are overlapped in time. This way, the coarse-grid problem can be fully overlapped by fine-grid computations (which are embarrassingly parallel) in a wide range of cases. Further, we consider inexact solvers to reduce the computational cost/complexity and memory consumption of coarse and local problems and boost the scalability of the solver. From our numerical experiments, we conclude that the BDDC preconditioner is quite insensitive to inexact solvers. In particular, one cycle of algebraic multigrid (AMG) is enough to attain algorithmic scalability. Further, the clear reduction of computing time and memory requirements of inexact solvers compared to sparse direct ones makes it possible to scale far beyond state-of-the-art BDDC implementations. Excellent weak scalability results have been obtained with the proposed inexact/overlapped implementation of the two-level BDDC preconditioner, up to 93,312 cores and 20 billion unknowns on JUQUEEN. Further, we have also applied the proposed setting to unstructured meshes and partitions for the pressure Poisson solver in the backward-facing step benchmark domain. Peer Reviewed. Postprint (author's final draft)